Description

[0001] This invention relates generally to an image processing technique, and more particularly to the automatic 
segmentation and characterization of a plurality of image objects placed on the platen of an image input device. 
[0002] Given a single scanned image composed of several separate photographs laid side by side, preferably not
touching, on the scanner platen, it is desired to identify automatically the position, shape and rotation angle of each 
original photograph. Such a capability can enhance productivity by decreasing the time required for scanning multiple 
images and by automating alignment corrections. 

[0003] Heretofore, a number of patents and publications have disclosed image segmentation and structured images,
the relevant portions of which may be briefly summarized as follows:

[0004] US-A-5,485,568 to Venable et al., issued January 16, 1996, and hereby incorporated by reference, discloses
a method and apparatus for representing a complex color raster image as a collection of objects in a structured image
format - a hierarchical, device-independent format. A structured image document, generated using the techniques
described by Venable, is a representation of data that may be rendered into a raster image. The data includes simple
raster images as well as a hierarchical collection of subobjects and raster processing operations. The possible data
types for objects in the structured image include a raster image, text, graphics, image processing description, and files
containing multiple image representations.

[0005] In "MANAGING AND REPRESENTING IMAGE WORKFLOW IN PREPRESS APPLICATIONS", Technical
Association of the Graphic Arts (TAGA) Vol. 1, 1995 Proceedings pp. 373-385, hereby incorporated by reference for
its teachings, Venable et al. teach the use of structured images to manage prepress workflow. An operation such as
gang scanning is described as a means for capturing several photographs roughly aligned on a scanner platen.
[0006] In accordance with the present invention, there is provided a method for processing a digital input image to 
characterize a plurality of objects therein, comprising: 

identifying at least two objects within the input image;

modeling a shape representing boundaries of each of the objects; and 

creating a description to characterize the objects, wherein the description may further characterize other attributes 
of the image. 

[0007] In accordance with another aspect of the present invention, there is provided an image processing apparatus,
including a programmable computer capable of receiving a digitized input image, the computer including a frame buffer 
memory for storing the input image and program memory for the storage of code suitable for causing the computer to 
execute image processing operations including: 

identifying a plurality of objects within the digitized input image;

modeling a shape representing boundaries of the object; and 
creating a description to characterize the object. 

[0008] The present invention is directed to a software-based system developed to accomplish the automatic determination
of independent regions or segments within a scanned image. The present invention combines a number of
graphics and image processing techniques into an automated, user-friendly application for productivity enhancement. 
The application can enhance productivity by decreasing the time required for scanning multiple images, by automating 
corrections for alignment of multiple images, and even automatically placing multiple images in a document template. 
[0009] The present invention accomplishes these objectives by:

1) locating a plurality of independent objects within the image;

2) modeling the shape of the identified objects (e.g., rectangle); and

3) creating a structured image description identifying the location, shape and orientation of each object within the
image.

[0010] One aspect of the invention deals with a basic problem in digital image processing, that of identifying plural 
objects within a digitized image. This aspect is further based on the discovery of image processing techniques that 
alleviate this problem. The techniques described herein enable a user to expediently scan a plurality of documents in 
a single scanning operation, and then automatically separate those documents by recognizing them as independent 
objects within the digitized image. Another aspect of the present invention allows for the automatic creation of a structured
image representation of the digitized image so that the image objects may be easily extracted and further processed
independently.

[0011] The techniques described above are advantageous because they improve the efficiency of a scanning process,
allowing multiple original documents to be scanned at one time. In addition, the techniques allow for the automatic
characterization of physical attributes (e.g., location, shape and orientation) of the objects without user intervention.
[0012] An example of the present invention will now be described with reference to the accompanying drawings, in
which:

Figure 1 is an illustration of the equipment that forms an image processing system serving as one embodiment for 
the present invention; 

Figure 2 is a block diagram of the various components comprising the system of Figure 1; 
Figure 3 is a flowchart illustrating the general processing steps carried out on the system of Figures 1 and 2 in
accordance with the present invention;

Figures 4-7 are detailed flow charts illustrating the processing steps carried out in accordance with various embodiments
of the present invention;

Figure 8 is an illustrative example of a portion of a digital document; 

Figure 9 is an illustration of the output of the system of Figure 1 when an input image is processed in accordance
with the present invention; and,

Figure 10 is an exemplary user interface screen associated with one embodiment of the present invention. 

[0013] For a general understanding of the present invention, reference is made to the drawings. In the drawings, like
reference numerals have been used throughout to designate identical elements. In describing the present invention,
the following term(s) have been used in the description.

[0014] The term "data" refers herein to physical signals that indicate or include information. When an item of data
can indicate one of a number of possible alternatives, the item of data has one of a number of "values." For example,
a binary item of data, also referred to as a "bit," has one of two values, interchangeably referred to as "1" and "0" or
"ON" and "OFF" or "high" and "low". A bit is an "inverse" of another bit if the two bits have different values. An N-bit
item of data has one of 2^N values. A "multi-bit" item of data is an item of data that includes more than one bit.

[0015] "Memory circuitry" or "memory" is any circuitry that can store data, and may include local and remote memory
and input/output devices. Examples include semiconductor ROMs, RAMs, and storage medium access devices with 
data storage media that they can access. A "memory cell" is memory circuitry that can store a single unit of data, such 
as a bit or other n-ary digit or an analog value. 

[0016] A signal "indicates" or "selects" one of a set of alternatives if the signal causes the indicated one of the set
of alternatives to occur. For example, a signal can indicate one bit set in a sequence of bit sets to be used in an 
operation, in which case the signal causes the indicated bit set to be used in the operation. 

[0017] An "image" is a pattern of physical light. An image may include characters, words, and text as well as other
features such as graphics. A text may be included in a set of one or more images, such as in images of the pages of
a document. An image may be processed so as to identify specific "objects" within the image, each of which is itself
an image. An object may be of any size and shape and has physical attributes or characteristics including, but not limited
to, position, shape and orientation.

[0018] An item of data "defines" an image when the item of data includes sufficient information to produce the image.
For example, a two-dimensional array can define all or any part of an image, with each item of data in the array providing
a value indicating the color of a respective location of the image.

[0019] An item of data "defines" an image set when the item of data includes sufficient information to produce all the
images in the set.

[0020] Each location in an image may be called a "pixel". In an array defining an image in which each item of data
provides a value, each value indicating the color of a location may be called a "pixel value". Each pixel value is a bit
in a "binary form" of an image, a gray scale value in a "gray scale form" of an image, or a set of color space coordinates
in a "color coordinate form" of an image, the binary form, gray scale form, and color coordinate form each being a
two-dimensional array defining an image.

[0021] An operation performs "image processing" when it operates on an item of data that relates to part of an image.
[0022] Pixels are "neighbors" or "neighboring" within an image when there are no other pixels between them and
they meet an appropriate criterion for neighboring. If the pixels are rectangular and appear in rows and columns within
a two-dimensional image, each pixel may have 4 or 8 neighboring pixels, depending on the criterion used.
[0023] An "edge" occurs in an image when two neighboring pixels have sufficiently different pixel values according 
to an appropriate criterion for the occurrence of an edge between them. The terms "edge pixel" or "boundary pixel" 
may be applied to one or both of two neighboring pixels between which an edge occurs. 

[0024] An "image characteristic" or "characteristic" is a measurable attribute of an image. An operation can "measure"
a characteristic by producing data indicating the characteristic using data defining an image. A characteristic is "measured
for an image" if the characteristic is measured in a manner that is likely to produce approximately the same result
each time it occurs.
[0025] A "version" of a first image is a second image produced using an item of data defining the first image. The
second image may be identical to the first image, or it may be modified by loss of resolution, by changing the data 
defining the first image, or by other processes that result in modifying pixel values of the first image. 
[0026] An "image input device" is a device that can receive an image and provide an item of data defining a version 
of the image. A "scanner" is an image input device that receives an image by a scanning operation, such as by scanning 
a document. 

[0027] An "image output device" is a device that can receive an item of data defining an image and provide or render 
the image as output. A "display" is an image output device that provides the output image in human viewable form, 
and a "printer" is an image output device that renders the output image in a human viewable, hard copy form. 
[0028] Referring now to Figures 1 and 2, depicted therein is a system 20 in which the present invention finds particular
use. System 20 includes a computer 22 capable of receiving digital data representing an image of an original document
24 placed upon a platen of scanner 26. Computer 22 initially stores the digital input data from scanner 26 in memory
52 (e.g., RAM or magnetic disk storage) where the image may subsequently be accessed. In addition to the digital
data, memory 52 may also include program memory for the storage of object code suitable for directing the processor
to execute image processing operations in accordance with the invention described herein. Computer 22 has associated
therewith a user interface (U/I) 28 including one or more user input devices 30, such as a keyboard, a keypad, a
mouse, trackball, stylus or equivalent pointing device, etc.

[0029] Also part of system 20 is an image output device such as printer 34 which may include a laser-driven, xerographic
printing engine as found in a number of commercially available printers. In a preferred embodiment, system
20 is employed to process the digital image data received as input from a scanner 26, utilizing image processing
software running in processor 50, so as to produce an output file that may be rendered by printer 34, stored in memory
52, and/or transmitted to another device via network 40. It will be appreciated that the document placed upon the
scanner platen may include a plurality of photographs or other objects represented by marks on a substrate surface,
and that such objects may be scanned by a single scanning operation. For example, a particular embodiment to which
the following description will be directed is a single scanned image representative of several separate photographs
laid side by side on the platen of scanner 26, but not touching or overlapping. In accordance with the present invention
it is desired to automatically identify the position, shape and rotation angle of each original photograph.
[0030] Given an input image generated by scanning several separate photographs laid side by side on the scanner
platen, the present invention automatically identifies at least the position, shape and orientation angle of each photograph.
As shown in the flow chart of Figure 3, the process carried out by computer 22 during the processing of the
input image includes three general steps. First, at step 100 the objects within the image are located and boundaries
of the objects are generally identified. Once the objects are located, the shape of the objects is modeled at step 200.
Having located the objects and modeled their shape, a structured image representing the image and objects therein
can be created as represented by step 300. The structured image preferably includes data representing not only the
image data itself, but data representing the location, shape or orientation of each object, or some combination thereof.
Alternatively, the output may be a page description language format or an equivalent format suitable for storing the
image information in a retrievable form.
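
As a rough illustration of the kind of record step 300 might emit, the sketch below gathers location, shape and orientation for each object into a plain JSON document. The class name `ObjectDescription`, its fields and the JSON layout are assumptions made for illustration only; they are not the structured image format of US-A-5,485,568.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class ObjectDescription:
    """Hypothetical per-object record: where the object is and how it is oriented."""
    center: tuple          # (x, y) position in the scanned image, in pixels
    width: float           # length of the side along the rotation angle
    height: float          # length of the perpendicular side
    angle_deg: float       # rotation angle of the object
    shape: str = "rectangle"


def describe(objects):
    """Serialize a list of ObjectDescription records to a JSON string."""
    return json.dumps({"objects": [asdict(o) for o in objects]}, indent=2)
```

A call such as `describe([ObjectDescription((120.0, 80.5), 300.0, 200.0, 12.5)])` yields a document that a later stage could use to crop and derotate each photograph.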

[0031] In a preferred embodiment of the present invention, the scanned input image (or a lower resolution version 
thereof) is loaded into a memory frame buffer (RAM) where it is analyzed in accordance with the previously described 
steps. For purposes of the following detailed description, it is assumed that objects do not occlude one another and 
that the background of the image is contiguous. These simplifying assumptions are intended for purposes of explanation 
only and are not intended as limitations of the invention. One skilled in the art will appreciate that the invention described 
herein is extensible so as not to require operation only within the boundaries of such assumptions. 
[0032] As depicted by the flow chart of Figure 4, the object location step 100 is performed by first identifying the
background region of the input image (step 102), characterizing the background region (step 104), and then, using the
characteristic of the background region as a seed, identifying all the pixels representing the background region with an
adaptive seed fill algorithm (step 106). Background pixels are pixels not associated with any objects, or more simply, they
are pixels representative of those regions lying outside of the objects, the values of which are controlled by the "background"
against which the objects are placed during scanning (e.g., the underside of the platen cover). One embodiment employs the
average color of a small region in the upper left-hand corner of the scanned image as an initial estimate of the background
color. Alternatively, other sampling operations may be employed to determine the background color as described,
for example, in US-A-5,282,091 for a Programmable Apparatus for Determining Document Background Level
by Farrell.

[0033] Once the background color is characterized at step 104, an adaptive algorithm is preferably applied to control
the background color and to accurately identify the objects. An example of a seed fill algorithm suitable for use in the
present invention is described in Graphics Gems I, A. Glassner Ed., Academic Press, pp. 275-277, 1990, hereby
incorporated by reference. An adaptive algorithm is required because the background pixels may have significant color
variation resulting from a variation in illumination over the platen area. The adaptive seed fill algorithm is applied to
the scanned color image data using an initial seed point characterized by the background, for example, the upper-left
corner of the image. Generally, the adaptive seed fill algorithm fills a binary frame buffer with a mask indicating all
contiguous pixels identified as background pixels. In a simple embodiment, represented by step 112, a pixel is considered
to be a background pixel if its color falls within a small distance ε of the current average background pixel value.
This distance is calculated as a Euclidean metric in red, green, blue (RGB) color space:

d = SQRT((P_r - AdAvg_r)^2 + (P_g - AdAvg_g)^2 + (P_b - AdAvg_b)^2),

where P_k and AdAvg_k are, respectively, the RGB components of the pixel under test and the average background value,
and d is the distance measurement. The value of ε is fixed and empirically determined in one embodiment. The test
conducted at step 112 is:

if d < ε, then pixel P is a background pixel, else pixel P is a foreground pixel.
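
The test of step 112 can be sketched in a few lines. The function names `color_distance` and `is_background` are illustrative choices rather than names taken from the source, and pixels are assumed to be (R, G, B) tuples.

```python
import math


def color_distance(pixel, avg):
    """Euclidean distance d between a pixel and the average background color in RGB space."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pixel, avg)))


def is_background(pixel, avg, eps):
    """Step 112: a pixel is background when d < eps, otherwise it is foreground."""
    return color_distance(pixel, avg) < eps
```

With an average of (255, 255, 255) and a threshold of 10, a pixel of (250, 250, 250) lies about 8.7 away and is classified as background, while (200, 255, 255) lies 55 away and is foreground.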

[0034] The average background color is adaptively modified at step 114 by taking the average value of the last N
pixels that have been classified as background. For efficiency, the system preferably calculates the adaptive average
using the equation:

AdAvg' = (N*AdAvg - AdAvg + LastVal)/N,

where AdAvg' is the modified average, AdAvg is the previous adaptive average, LastVal is the value of the last pixel
identified as background, and N is the averaging window. Clearly, this is not a true running average, but it tracks the
running average adequately and is more computationally efficient than a strict running average calculation. Alternatively,
the value of ε can be adaptively modified. For example, ε might be based on the standard deviation of the last
several pixels identified as background, etc.
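
A minimal sketch of steps 106 to 114 follows. It is a simplification under stated assumptions: the fill is a 4-connected breadth-first flood rather than the scan-line seed fill of Graphics Gems, the image is a list of rows of (R, G, B) tuples, and the names `update_average` and `adaptive_seed_fill` are hypothetical.

```python
from collections import deque
import math


def update_average(avg, last_val, n):
    """AdAvg' = (N*AdAvg - AdAvg + LastVal) / N, applied to each color channel."""
    return [(n * a - a + v) / n for a, v in zip(avg, last_val)]


def adaptive_seed_fill(image, seed, eps, n):
    """Return a mask (True = background) grown from `seed`, a (row, col) tuple."""
    height, width = len(image), len(image[0])
    mask = [[False] * width for _ in range(height)]
    avg = [float(c) for c in image[seed[0]][seed[1]]]  # initial background estimate
    queue = deque([seed])
    seen = {seed}
    while queue:
        row, col = queue.popleft()
        pixel = image[row][col]
        d = math.sqrt(sum((p - a) ** 2 for p, a in zip(pixel, avg)))
        if d >= eps:
            continue  # foreground pixel: the fill does not spread through it
        mask[row][col] = True
        avg = update_average(avg, pixel, n)  # step 114: track slow illumination drift
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = row + dr, col + dc
            if 0 <= r < height and 0 <= c < width and (r, c) not in seen:
                seen.add((r, c))
                queue.append((r, c))
    return mask
```

Because each pixel is tested once against the average current at the time it is reached, the result depends on the traversal order; a production fill would likely revisit borderline pixels.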

[0035] Having identified all background pixels and created a binary mask representative of the background regions,
the process at step 120 is executed to smooth noisy edges in the background mask using morphological filtering. More
specifically, a morphological closure filter is preferably applied to the background mask to eliminate single pixel noise
and to smooth object edges. Subsequently, contiguous foreground regions are located, step 122, thereby identifying
the objects. Objects are identified by scanning the background mask generated by the adaptive seed fill operation
(step 106). Starting with the upper left-hand pixel, the mask is searched in a scan line fashion for a pixel not classified
in the mask as a background pixel - thus identifying pixels associated with a foreground object. The use of the seed
fill algorithm for identifying the background assures that foreground objects are closed.
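
Steps 120 and 122 might look like the following sketch. The 3x3 structuring element and the treatment of image borders (out-of-bounds neighbors are simply ignored) are assumptions, since the source does not specify them, and the function names are hypothetical.

```python
def _neighbourhood(mask, row, col):
    """Values of the in-bounds pixels in the 3x3 window centered on (row, col)."""
    height, width = len(mask), len(mask[0])
    return [mask[r][c]
            for r in range(max(0, row - 1), min(height, row + 2))
            for c in range(max(0, col - 1), min(width, col + 2))]


def close_mask(mask):
    """Morphological closing (dilate, then erode) of a boolean background mask."""
    height, width = len(mask), len(mask[0])
    dilated = [[any(_neighbourhood(mask, r, c)) for c in range(width)]
               for r in range(height)]
    return [[all(_neighbourhood(dilated, r, c)) for c in range(width)]
            for r in range(height)]


def first_foreground(mask):
    """Scan line by line from the upper-left pixel; return the first non-background pixel."""
    for r, row in enumerate(mask):
        for c, is_background in enumerate(row):
            if not is_background:
                return (r, c)
    return None
```

Closing the background mask fills isolated foreground specks while leaving larger foreground regions in place, so the scan of step 122 starts on a real object rather than on noise.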

[0036] At step 124, the boundary of an object is identified by tracing its edge. The boundary of the foreground object
is traced using a simple 8-connected edge traversal which provides an ordered set of points tracing the edge of the
object. Such an edge traversal operation employs a contour tracing operation to generate a chain code in a manner
similar to word or character based recognition systems. An 8-connected process is described, for example, by R.
Bozinovic et al. in "Off-Line Cursive Script Word Recognition", IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 11, No. 1 (January 1989). Once the edge is traced, all pixels associated with the object in the mask
are marked as background so they will not be processed a second time, the object is added to the foreground object
list, and then the scanning of step 122 is continued as indicated by test step 126. Subsequent to completing the foreground
scanning to identify all objects, a review of the identified objects may be completed as represented by step
130. In many cases, the scanned image may contain undesirable foreground objects; such objects can be eliminated
from the object list at this step. In one embodiment, the review of the object list may simply eliminate small objects as
unlikely images. For example, in a scan of a yearbook page each image has associated with it a text caption that is
not to be classified as image data. Such captions consist of many small-perimeter objects, so that by measuring the
perimeter length of the traced edges, it is possible to eliminate objects having a perimeter smaller than a specified
length, where the threshold length may be predetermined empirically.
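
The review of step 130 can be sketched as a perimeter filter over the traced edges. Measuring the perimeter as the summed distance between consecutive edge points of the closed trace is an assumption (counting edge points would serve equally well), and the function names are hypothetical.

```python
import math


def perimeter_length(trace):
    """Length of the closed polyline through the ordered (x, y) edge points."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(trace, trace[1:] + trace[:1]))


def drop_small_objects(traces, min_perimeter):
    """Keep only the edge traces whose perimeter reaches the empirical threshold."""
    return [t for t in traces if perimeter_length(t) >= min_perimeter]
```

Caption characters produce traces only a few pixels long, so even a modest threshold separates them from photographs.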

[0037] Once the objects have been located, as described with respect to step 100, the next general step, step 200,
is to model the shape of the object. For purposes of simplicity, the following description will treat rectangular-shaped
objects; however, it will be appreciated that the description is extensible to other polygons and even to shapes having
portions thereof represented by curves (e.g., circular or elliptical objects). The result or output from step 100 is preferably
a set of edge traces, in the form of linked lists, that identify bounding pixels about each object within the scanned image.
These traces can be used to extract each object, but orientation is not yet determined. To improve the quality of the
object extraction, the object traces are fitted to a model shape. Orientation information, etc., may then be extracted
from the fitted parameters. In the described embodiment the object traces are fit to a rectangular model; however, other
shapes are possible.

[0038] One method of fitting the edge traces to a rectangular shape is a least-squares approach to fit to a rectangle.
To accomplish the least-squares fitting, the edge trace is first decomposed into four sets of points, each corresponding
to one of the four sides of the rectangular object. The decomposition into four sets of points can be accomplished in
several ways as described below.

[0039] The first method has two principal parts, (a) categorizing the edge points into a set of bins associated with a
single line, and (b) performing recognition on the bins for rotated shapes. Referring now to Figure 5, where the first
decomposition method is depicted in detail, step 204 calculates the slope at each point along the edge trace. Step 204
preferably accomplishes the slope angle calculation by performing a linear regression on a small window of neighboring
edge points, for example, the 2 points lying on either side of the edge point for which the slope is being determined. The
angle of the line passing through the center of each point is determined using linear regression in a small window
centered on each point. Each regression requires 4 additions per point in the window, plus 2 subtractions, 2 multiplications,
and an arctangent calculation; however, the regression algorithm may be further optimized to remove most of
the addition operations. In a preferred embodiment, which reduces the computational complexity, a sample of the edge
pixels is employed for slope angle calculations and sorting, thereby reducing the number of calculations necessary
to categorize the edge pixels.
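
A sketch of the slope calculation of step 204 is given below. It assumes integer pixel coordinates stored as (x, y) tuples in a closed trace; the `step` parameter stands in for the sampling mentioned above, and the function names are hypothetical. The special case for a vertical run is an addition needed because the regression of y on x is undefined there.

```python
import math


def slope_angle(window):
    """Angle in degrees, within [-90, 90], of the regression line through (x, y) points."""
    n = len(window)
    sx = sum(x for x, _ in window)
    sy = sum(y for _, y in window)
    sxx = sum(x * x for x, _ in window)
    sxy = sum(x * y for x, y in window)
    num = n * sxy - sx * sy
    den = n * sxx - sx * sx  # never negative; zero when every x is equal
    if den == 0:
        return 90.0  # vertical run (exact test, valid for integer coordinates)
    return math.degrees(math.atan2(num, den))


def slope_angles(trace, half_window=2, step=1):
    """Slope angle at every `step`-th point of a closed edge trace."""
    n = len(trace)
    return [slope_angle([trace[(i + k) % n]
                         for k in range(-half_window, half_window + 1)])
            for i in range(0, n, step)]
```

Points in the middle of a side give a clean 0 or 90 degrees on an axis-aligned rectangle, while points near a corner give intermediate angles, which is why the subsequent binning matters.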

[0040] Next, at step 206, the process constructs a list of slope categories or bins. The slope categories are constructed
for each edge point by calculating the magnitude of the difference in the slope angle between the current point
along the edge (e.g., point B in Figure 8) and the preceding point (e.g., point A in Figure 8). If the difference is less
than the value TOLERANCE (determined empirically to be ± 5 degrees in one embodiment), then the point is assigned
to the same slope category as the preceding point, otherwise a new slope category is created and the point is assigned
to it. Referring to Figure 8, the above-described process would assign points A, B and C to a first slope category, points
D, E, F, G and H to a second slope category and points I, J ... to yet another slope category. Finally, if the slope category
for the last edge point has approximately the same slope angle as the first slope category, then all points within the
first and last slope categories are joined together into a single category.
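
The binning of step 206 can be sketched as follows, taking a non-empty list of slope angles in degrees, one per edge point in trace order. Comparing angles modulo 180 degrees (so that 89 and -89 count as nearly parallel) is an assumption beyond the text, as are the function names.

```python
def angle_difference(a, b):
    """Magnitude of the difference between two line angles in degrees, wrapped to [0, 90]."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def slope_categories(angles, tolerance=5.0):
    """Group consecutive edge-point indices whose slope angles differ by less than `tolerance`."""
    categories = [[0]]
    for i in range(1, len(angles)):
        if angle_difference(angles[i], angles[i - 1]) < tolerance:
            categories[-1].append(i)   # same slope category as the preceding point
        else:
            categories.append([i])     # start a new slope category
    # A closed trace wraps around: join the last category to the first if the slopes agree.
    if len(categories) > 1 and angle_difference(angles[-1], angles[0]) < tolerance:
        categories[0] = categories.pop() + categories[0]
    return categories
```

For the angle sequence 0, 1, 0, 90, 89, 90, 91, 0, 2 this produces two categories, with the trailing points 7 and 8 joined to the leading points 0 to 2.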

[0041] Once the slope categories are established at step 206, and stored in a data structure, they are then sorted 
at step 208 and ordered according to the number of edge points assigned to each category. For rectangular objects, 
the top four slope categories, those containing the most edge points, should correspond to points along the four edges 
of the rectangle. The top slope categories are then selected at step 210. It will be appreciated that one would use the 
top six categories for hexagonal objects, and similarly the top three categories for triangular objects, etc. 
[0042] Alternatively, steps 208 and 210 may be replaced by a step that processes the slope angle categories or bins
by simple, or even statistical, elimination, wherein those categories with few entries are removed. For example, an
empirically determined threshold of 5 pixels may be applied so that only bins having more than 5 pixels with a common
angle are kept. Subsequently, an average angle for a category may be determined using simple linear regression of
all the points assigned to a particular category. With the average angle determined, a further refinement of the categories
would be possible, combining those categories having substantially common angles. In particular, each category is
checked and if adjacent categories are substantially collinear, the categories are joined. Thus each of the remaining
bins or categories represents a set of collinear points lying along an edge. The edge points assigned to each of the
remaining slope angle categories represent the edge trace decomposed into the four sides of the rectangle. It will be
appreciated that this alternative is broadly directed to the process of "filtering" or refining the categories to identify
those representing the actual edge of the objects. Accordingly, equivalent methods of accomplishing the refinement
of the categories are contemplated.
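
The two selection strategies described above (sorting and keeping the largest categories, or eliminating sparse ones) can be sketched as below. Categories are assumed to be lists of edge-point indices, and the function names are hypothetical; the joining of collinear categories is left out of this sketch.

```python
def top_categories(categories, sides=4):
    """Steps 208 and 210: keep the `sides` categories holding the most edge points."""
    return sorted(categories, key=len, reverse=True)[:sides]


def prune_categories(categories, min_points=5):
    """Alternative: keep only the categories holding more than `min_points` points."""
    return [c for c in categories if len(c) > min_points]
```

Passing `sides=3` or `sides=6` to `top_categories` corresponds to the triangular and hexagonal cases mentioned earlier.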

[0043] This first method of characterizing the object boundaries is computationally intensive due to the measurement 
of the average slope at each edge point. In the alternative embodiment mentioned previously, to improve speed, the 
edge trace may be sampled to reduce the total number of points that must be processed and categorized. 
[0044] It will be further appreciated that it may be possible, from an analysis of the ordered categories, to identify
the shape. For example, a statistically significant difference in the number of points between a third and fourth category,
or the complete lack of a fourth category, is indicative of a triangular-shaped object.

[0045] Referring to Figure 6, depicted therein is the second method by which the object shapes may be modeled.
After retrieving the edge trace list data at step 202, step 252 calculates the center of mass of the object. Although there
are a number of well-known methods for calculating the center of mass of the object, in the case of rectangular objects
a straightforward approach would be averaging the (x,y) coordinates of the edge points. Next, the edge point closest
to the center of mass would be located at step 254. The closest point will be the approximate center of the long side
of the rectangle. Referring again to Figure 8, the angle θ from the center-of-mass (CofM) to the center point (L_a/2) is
the approximate rotation angle (θ) of the rectangle.
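
Steps 252 and 254 can be sketched as follows, with edge points stored as (x, y) tuples. The function names are hypothetical, and the angle is reported in degrees measured with `atan2` from the positive x axis, which is an assumed convention.

```python
import math


def center_of_mass(trace):
    """Step 252: average the (x, y) coordinates of the edge points."""
    n = len(trace)
    return (sum(x for x, _ in trace) / n, sum(y for _, y in trace) / n)


def approximate_rotation(trace):
    """Step 254: return (center, closest edge point, angle in degrees toward that point)."""
    cx, cy = center_of_mass(trace)
    closest = min(trace, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
    theta = math.degrees(math.atan2(closest[1] - cy, closest[0] - cx))
    return (cx, cy), closest, theta
```

For an axis-aligned rectangle 8 wide and 4 high, the closest edge point is the middle of one long side, and the angle comes out as plus or minus 90 degrees depending on which long side is encountered first.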

[0046] Once the rotation angle is determined, it is employed in step 256 to determine the approximate length of the
minor axis of the rectangle at step 258. In particular, the distance from the center-of-mass to the average position of
all edge points that lie in the angular range θ-ΔA to θ+ΔA is determined. This distance is an approximate measure of
one-half the minor axis length (L_b) of the rectangle. ΔA is an empirically determined value on the order of approximately
5 degrees. Step 260 approximates the length of the major axis (L_a) in much the same manner. The distance from the
center-of-mass to the average position of all edge points that lie in the angular range (θ+90)±ΔA is an approximate
measure of one-half the length of the major axis L_a of the rectangle. Having approximated the orientation angle and
the lengths of the major and minor axes, step 264 calculates an angular range (as measured with respect to the
center-of-mass) for each side of the rectangle that encompasses only those edge points associated with that side:

a)
θ'_b = atan2(L_a, L_b)      half angle width of major axis;
θ'_a = 90 - θ'_b            half angle width of minor axis;
θ_b = θ'_b * TOL            where TOL=0.95 to avoid corners;
θ_a = θ'_a * TOL            where TOL=0.95 to avoid corners;

and
b)
Range 1: (θ+θ_b) to (θ-θ_b)
Range 2: ((θ+90)+θ_a) to ((θ+90)-θ_a)
Range 3: ((θ+180)+θ_b) to ((θ+180)-θ_b)
Range 4: ((θ+270)+θ_a) to ((θ+270)-θ_a)


[0047] Once the angular range is determined, step 266 finds all the edge points that lie within each of the four angular 
ranges (relative to the center-of-mass) determined above, thereby identifying the edge points corresponding to each 
side of the rectangle. It will be appreciated that this technique is less sensitive to edge-noise than the first method 
described above. 
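
Steps 264 and 266 can be sketched as below, using the half angle widths θ_b = TOL·atan2(L_a, L_b) and θ_a = TOL·(90 - atan2(L_a, L_b)) centered on θ, θ+90, θ+180 and θ+270. All angles are in degrees; the function names and the (x, y) tuple representation are assumptions.

```python
import math


def side_ranges(theta, l_a, l_b, tol=0.95):
    """Return four (center angle, half width) pairs in degrees, one per side."""
    full_b = math.degrees(math.atan2(l_a, l_b))  # half angle subtended by a long side
    half_b = full_b * tol                         # theta_b, shrunk to avoid corners
    half_a = (90.0 - full_b) * tol                # theta_a, shrunk to avoid corners
    return [(theta, half_b), (theta + 90.0, half_a),
            (theta + 180.0, half_b), (theta + 270.0, half_a)]


def split_into_sides(trace, center, theta, l_a, l_b, tol=0.95):
    """Assign each edge point to the side whose angular range contains it; corners are dropped."""
    sides = [[], [], [], []]
    ranges = side_ranges(theta, l_a, l_b, tol)
    for x, y in trace:
        angle = math.degrees(math.atan2(y - center[1], x - center[0]))
        for k, (mid, half) in enumerate(ranges):
            offset = (angle - mid + 180.0) % 360.0 - 180.0  # signed difference in [-180, 180)
            if abs(offset) <= half:
                sides[k].append((x, y))
                break
    return sides
```

With TOL below 1 the four ranges leave small gaps at the corners, so corner pixels, whose side membership is ambiguous, fall into none of the four sets.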

[0048] Once the edge trace has been decomposed into four sets of points, each set corresponding to one of the four
sides of the rectangle, a least squares calculation for fitting the points to a rectangle is evaluated at step 280. A rectangle
can be described as four mutually perpendicular lines defined by the equations:

y = α_0 + βx,
y = α_1 + γx,
y = α_2 + βx,
y = α_3 + γx,

where βγ = -1. A least squares fit yields the fitted parameters:

[equations for the fitted parameters are illegible in the source]

where (x_ki, y_ki) is the i-th edge point of the k-th side, and n_k is the number of edge points associated with the k-th side.
Subsequently, once the least squares fit yields the fitted parameters (β and the intercepts α_k), they are converted at step 282
into four coordinate pairs marking the corners of the rectangle. Moreover, the rotation angle of the rectangular
object is accurately represented by the slope parameter β.
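
Because the closed-form expressions for the fitted parameters did not survive in this text, the sketch below substitutes a closely related technique: it fits the same four-line model (two parallel sides and two sides perpendicular to them) by minimizing orthogonal distances, which has a closed-form solution and also handles vertical sides. It is an assumption-laden stand-in, not the formula of step 280; the slope β corresponds to the tangent of the returned angle, and the function name is hypothetical.

```python
import math


def fit_rectangle(sides):
    """Fit a rectangle to four point sets; sides 0 and 2 are parallel, 1 and 3 perpendicular.

    Returns (angle of sides 0 and 2 in degrees, list of four (x, y) corners).
    """
    p = q = r = 0.0
    for k, pts in enumerate(sides):
        n = len(pts)
        mx = sum(x for x, _ in pts) / n
        my = sum(y for _, y in pts) / n
        sxx = sum((x - mx) ** 2 for x, _ in pts)
        syy = sum((y - my) ** 2 for _, y in pts)
        sxy = sum((x - mx) * (y - my) for x, y in pts)
        if k % 2 == 0:      # side runs along the fitted direction
            p, q, r = p + sxx, q + syy, r + sxy
        else:               # side runs perpendicular to it
            p, q, r = p + syy, q + sxx, r - sxy
    phi = 0.5 * math.atan2(2.0 * r, p - q)   # direction minimizing the squared distances
    c, s = math.cos(phi), math.sin(phi)
    offsets = []
    for k, pts in enumerate(sides):
        n = len(pts)
        if k % 2 == 0:      # line: -s*x + c*y = offset
            offsets.append(sum(-s * x + c * y for x, y in pts) / n)
        else:               # line:  c*x + s*y = offset
            offsets.append(sum(c * x + s * y for x, y in pts) / n)
    corners = []
    for even, odd in ((0, 1), (2, 1), (2, 3), (0, 3)):   # intersect adjacent sides
        e, o = offsets[even], offsets[odd]
        corners.append((c * o - s * e, s * o + c * e))
    return math.degrees(phi), corners
```

Sharing a single angle across all four sides is what distinguishes this from four independent line fits: noise on a short side cannot skew that side away from perpendicular.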

[0049] Yet another alternative method for fitting the edge traces to a shape is a method employing binary moments
for fast image bounding. Using the binary mask generated as described with respect to step 106 (e.g., the adaptive
seed fill algorithm), or alternatively with a simple thresholding operation, the image is rendered in a binary bitmap form
where each pixel value is a 0 or 1 indicating background or non-background regions. Once the borders are detected
for an object using the binary mask, the alternative embodiment depicted in Figure 7 employs second-order binary
moments to fit a shape (e.g., rectangle) to the object.

[0050] Referring to Figure 7, depicted therein is a generalized flowchart representing the steps of the binary moment
boundary finding technique. At step 100, the object edges are located and recorded as previously described, thereby
providing as an input a linked list of boundary or edge pixels referred to as an edge trace, step 290. Using the boundary
list, the second order moments are calculated (step 292) in an efficient manner using the equation:

[Equation not legible in the source.]

where $p(i,j)$ is the image pixel value at image coordinates $(i,j)$ and $p_l(i)$ is the $l$-th order moment of scan line $i$.
Because the object boundary pixels are previously determined, the process can be simplified and the rightmost and
leftmost boundary pixels for a particular scanline are used for the 1st order (absolute) moment calculations.
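
The scanline shortcut can be sketched as follows. Since the equation is not legible in this copy, the sketch assumes the standard raw-moment definition $m_{kl} = \sum_i \sum_j i^k j^l \, p(i,j)$ with per-scanline sums evaluated in closed form from the leftmost and rightmost boundary pixels. The function names, the `(i, left, right)` span format, and the assumption that every pixel between the two boundary pixels is foreground are all introduced here for illustration.

```python
def raw_moments(spans):
    """Raw moments up to second order from per-scanline boundary pixels.

    spans: iterable of (i, left, right), where i is the scanline index and
    left/right are the inclusive, non-negative column indices of the boundary.
    Returns a dict keyed by (k, l): power of i, power of j.
    """
    def sum_sq(n):
        # 0^2 + 1^2 + ... + n^2, also valid for n = -1 (gives 0)
        return n * (n + 1) * (2 * n + 1) // 6

    m = {(0, 0): 0, (1, 0): 0, (0, 1): 0, (1, 1): 0, (2, 0): 0, (0, 2): 0}
    for i, left, right in spans:
        p0 = right - left + 1                   # zeroth-order scanline moment
        p1 = (left + right) * p0 // 2           # sum of j over the span
        p2 = sum_sq(right) - sum_sq(left - 1)   # sum of j^2 over the span
        m[0, 0] += p0
        m[1, 0] += i * p0
        m[0, 1] += p1
        m[1, 1] += i * p1
        m[2, 0] += i * i * p0
        m[0, 2] += p2
    return m

def central_moments(m):
    """Centroid and area-normalized central second moments."""
    area = m[0, 0]
    ibar = m[1, 0] / area
    jbar = m[0, 1] / area
    mu20 = m[2, 0] / area - ibar * ibar
    mu11 = m[1, 1] / area - ibar * jbar
    mu02 = m[0, 2] / area - jbar * jbar
    return (ibar, jbar), (mu20, mu11, mu02)
```

The cost is proportional to the number of scanlines rather than the number of pixels, which is the efficiency the paragraph above refers to.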

[0051] Subsequently, the 2nd order (central) moments ($m_{00}$, $m_{01}$, $m_{10}$, $m_{11}$, $m_{20}$ and $m_{02}$) are calculated using the
1st order moments and the following equations:

[Equations not legible in the source.]

[0052] Having determined the 2nd order moments, they are employed to characterize an ellipse and from the ellipse
the bounding box about the object, step 294. In particular, the center of the ellipse (x, y), the lengths of each axis (a
and b) and the rotation angle ($\theta$) are determined. The bounding box for the rectangular object is determined as a
rectangle centered at (x, y) with sides of length 2a and 2b, rotated by an angle $\theta$. While this renders a bounding box
slightly larger than the object, this is done so as to provide a safety margin for the calculation, and to avoid cropping
a portion of the object. If a tighter bounding box is desired, the rectangle would be characterized with sides of length
$2\alpha a$ and $2\alpha b$, where $\alpha$ is set equal to $\sqrt{3}/2$ or a slightly smaller value to accomplish edge trimming or cropping (e.g.,
on the order of one or more pixels).
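
The ellipse and bounding-box step can be sketched as below. The sketch assumes the usual equal-second-moment ellipse, whose semi-axes are twice the square roots of the eigenvalues of the normalized central moment matrix; this convention is an assumption, although it is consistent with the $\sqrt{3}/2$ factor above, since a solid rectangle of half-width $w$ has variance $w^2/3$. The function name and argument order are hypothetical.

```python
import math

def bounding_box_from_moments(center, mu_xx, mu_xy, mu_yy, alpha=1.0):
    """Ellipse parameters and rotated bounding box from central moments.

    mu_xx, mu_xy, mu_yy: area-normalized central second moments.
    alpha: shrink factor; 1.0 gives the 2a x 2b box, sqrt(3)/2 a tighter one.
    Returns (a, b, theta, corners).
    """
    cx, cy = center
    theta = 0.5 * math.atan2(2.0 * mu_xy, mu_xx - mu_yy)  # major-axis angle
    mean = (mu_xx + mu_yy) / 2.0
    spread = math.hypot(mu_xx - mu_yy, 2.0 * mu_xy) / 2.0
    a = 2.0 * math.sqrt(mean + spread)             # semi-major axis
    b = 2.0 * math.sqrt(max(mean - spread, 0.0))   # semi-minor axis
    ha, hb = alpha * a, alpha * b                  # half-sides of the box
    ux, uy = math.cos(theta), math.sin(theta)
    corners = [(cx + sa * ha * ux - sb * hb * uy,
                cy + sa * ha * uy + sb * hb * ux)
               for sa, sb in ((1, 1), (-1, 1), (-1, -1), (1, -1))]
    return a, b, theta, corners
```

For an ideal 8 x 4 rectangle the normalized moments are 16/3 and 4/3, so calling the function with `alpha = math.sqrt(3) / 2` yields half-sides of 4 and 2, while the default `alpha = 1.0` yields the slightly larger safety-margin box.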

[0053] After each object has been modeled as a shape (e.g., rectangle), a structured image is created as described,
for example, in US-A-5,485,568 to Venable et al. The structured image consists of one "child" structured image for
each object detected using one of the methods described above. The structured image definition contains attributes
that specify which rectangle of the scanned image contains the object data, and also the rotation angle required to
correct for any orientation skew. Figure 9 is an example of a structured image created in accordance with the previously
described processes, the structured image containing a pair of rectangular-shaped image objects.
[0054] In one embodiment of the present invention, depicted in Figure 10, the structured image is designed such
that when rendered, all objects are de-rotated and laid out in a grid fashion. In particular, Figure 10 illustrates a user
interface 400 that may be employed with various aspects of the previously described object shape recognition method
to provide an intelligent or "smart" platen or scanning system. The smart scanning system represented by Figure 10
preferably provides a means by which a user can interface with a digitizing scanner to efficiently obtain digitized
representations of objects placed on platen 24 of a scanner.
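
A grid layout of de-rotated objects can be sketched as follows. The layout rule used here (a single row of equal-width cells on a unit-square page, each object vertically centered) is an assumption made for illustration, as are the function name `layout_row` and the convention that aspect ratio means height divided by width; the source does not state the rule. The placement of the second object in Figure 9 is consistent with it.

```python
def layout_row(aspect_ratios):
    """Place de-rotated objects side by side on a unit-square page.

    aspect_ratios: height/width of each object after de-rotation.
    Returns a list of ((x, y), (width, height)) in normalized page units.
    """
    count = len(aspect_ratios)
    width = 1.0 / count
    placements = []
    for column, aspect in enumerate(aspect_ratios):
        height = width * aspect
        x = column * width
        y = (1.0 - height) / 2.0  # center the object vertically in its cell
        placements.append(((x, y), (width, height)))
    return placements
```

A template-driven system would replace this fixed rule with positions read from the template definition.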

[0055] For example, referring to Figure 10 in conjunction with Figure 1, a user may place a number of photographs
on the scanner platen. Once placed, the user may then select an operation from region 410 of Figure 10 to cause the
computer system 22 to initiate scanning by scanner 26. As depicted in Figure 10, after the "Gang & Edit" selection
(412) is made, system 20 scans the objects placed on platen 24 and temporarily stores the data in a file using the
details reflected in region 420 of the user interface screen. For example, the various image objects (A, B, C and D)
may be found within an image as illustrated in Figure 11. Once the image is scanned, it is analyzed as described above
to identify the image objects. The image objects may then be manipulated by the smart scanning system to automatically
orient and position the images; for example, they may be automatically placed in a prespecified template and rendered,
such as the representation depicted in region 430 of the user interface. It will be appreciated that a user may also be
given additional edit capability with respect to the template, for example to add captions to the objects, or to include
titles 432 and subtitles 434 as illustrated. Input for such text-based editing would be accomplished via the user interface
options depicted in region 440.

[0056] Also enabled by the smart scanning system would be image editing capabilities as illustrated in region 450
of the user interface. Having identified each of the objects within the image, it is possible to isolate the objects, create
separate images therefrom, and to then individually process the images. Thus the individual image objects automatically
placed within the template of region 430 may be individually selected, manipulated, scaled (button 452), rotated
(button 454) or cropped (button 456). It will be appreciated that the scaling, rotation and cropping operations are in
addition to those which are preferably automatically applied by the system as the result of the previously described
object recognition methods.

[0057] For example, the image scaling button, illustrated with cross-hatching to depict selection, will allow the user
to move a cursor (not shown) to select an object (e.g., image object D) and then to drag a side or corner of the object
so as to scale the image object. To facilitate the editing of the objects, control points such as those illustrated about
the boundary of image object D (436) may be employed in a manner well-known to those who design user interfaces.
[0058] As noted, a predefined template may be used to automatically "place" image objects in relative positions on
a document or page thereof. It will be appreciated that such templates may be in the form of a structured image definition,
so that the template can be used to specify a different layout for the structured image to be generated. Thus, a family
seeking to put its photographs in a "digital photo album" may be able to create a template describing a page similar to
that shown in region 430 of the user interface. The template would then be used to automatically organize individual
images or plural objects within a larger document image.

[0059] In a preferred embodiment, the output would be a structured image output format as described by Venable
et al. in US-A-5,485,568. An important characteristic of structured images is the ability to store image processing
operations in their description. This means that the structured image can contain image processing operations other
than simple object deskewing attributes. For example, automatic image enhancement operations may be included
within the structured image such that the objects identified can be individually enhanced.

[0060] Once the "page" composed in window 430 is in the condition desired by the user, the user may save the image
by selecting the "Save Edited Image" button 460. More importantly, a user may then print or otherwise distribute the
composed page(s).

[0061] Although the various embodiments of the present invention have been described with respect to the smart
scanning system, it will be appreciated that the acquisition of images, and the printing and distribution of the composed
pages can be accomplished via networks or on a walk-up digital copier. For example, a user may have photographs
automatically scanned by a film processor, and a digitized stamp sheet sent to the user via a network. The stamp sheet,
being in a structured image format, could then be processed using the smart scanning system to produce pages of a
digital photo album with one or more objects on each page.

[0062] In recapitulation, the present invention is a method and apparatus for processing a digital input image to
characterize a plurality of objects therein. The technique includes: identifying at least one object within the input image
by characterization of background and foreground pixels; modeling a shape representing boundaries of the object
using one of two general methods; and creating a description to characterize the object, the description including not
only shape and location of the object, but object rotation or skew information as well.


Claims 

1. A method for processing a digital input image to characterize a plurality of objects therein, comprising:

identifying (100) at least two objects within the input image;
modeling (200) a shape representing boundaries of each of the objects; and
creating (300) a description to characterize the objects.

2. The method of claim 1, wherein the description comprises a subset of characteristics selected from the set
consisting of:

a position of the object with respect to the input image;
a shape of the object; and
a rotation angle of the object with respect to the input image.


3. The method of claim 1 or claim 2, wherein the step of identifying at least one object comprises the steps of:

identifying (102) the background region surrounding the at least two objects;
smoothing (120) noisy edges within the image using a morphological filtering process; and
locating (122) a contiguous foreground region.

4. The method of claim 3, wherein the step of identifying the background region comprises:

creating a binary mask, each location in the mask representing a pixel of the input image;
determining a background color;
using an adaptive seed fill process, setting each binary location in the mask equal to a first state if the color
is substantially equal to the background color, otherwise setting it to a second binary state;
adjusting the background color, if necessary; and
continuing the process until at least the edges of all objects have been identified.


5. The method of claim 4, wherein the step of identifying at least two objects further comprises the steps of: 

determining the boundary of each object by tracing (202) an edge; and 

eliminating foreground regions within the input image that are not associated with imagery. 


6. The method of claim 5, wherein the boundary of each object is represented by at least one tracing of the edge and 
where the step of modeling a shape representing boundaries of each object further comprises the step of fitting 
the tracing of the edge to a model shape.

7. The method of claim 6, wherein the step of fitting the tracing of the edge to a model shape comprises:

decomposing the tracing of the edge into at least one set of points; and 

conducting a least squares analysis (283) of the points to fit the edge to a model shape. 

8. The method of claim 7, wherein the step of decomposing the tracing of the edge comprises: 

calculating (204) the slope at each point along the edge; 

categorizing (206) each point along the edge into one of a plurality of bins, such that all points within any bin
have slopes that are within a predetermined tolerance;

refining the bins so as to combine any bins having a substantially similar slope;

ordering the bins (208) in accordance with the relative number of edge points assigned to each bin; and

applying a selection criteria to the ordered categories to identify those which represent an edge of the image
object.

9. The method of claim 8, wherein the step of applying (210) a selection criteria identifies an N-sided image object 
and wherein the selection criteria is the selection of N bins having the largest number of edge points assigned 
thereto. 

10. The method of claim 8, wherein the selection criteria is statistically determined based upon an analysis of the 
number of edge points assigned to each of the bins, thereby identifying a plurality of bins as representing edges 
of the object. 

11. The method of claim 8, wherein the steps therein are applied only to a subsample of the edge pixels so as to 
reduce the number of calculations necessary to categorize the edge pixels. 

12. The method of claim 11, further comprising the steps of:

removing any bins that contain less than a predetermined number of edge points;
determining the average angle of the edge points within each bin;
combining adjacent groups that share substantially collinear angles so as to reduce the edge trace to
representation by a small number of groups.

13. The method of claim 7, wherein the step of decomposing the tracing of the edge comprises: 

calculating (252) the center of mass of the object; 
locating (254) the edge point closest to the center of mass; 

determining (256) the angle from the center-of-mass to the center point of a longer side of the rectangle; 
using the angle from the center-of-mass to the center point, approximating the length of major and minor axes 
of the rectangle; and 

calculating an angular range, relative to the center-of-mass, for each side of the rectangle that encompasses 
only those edge points associated with a particular side. 

14. The method of claim 6, wherein the step of fitting the tracing of the edge to a model shape comprises:

decomposing the tracing of the edge into at least one set of points; and 
conducting a moment analysis of the points to fit the edge to a model shape. 

15. The method of claim 14, wherein the step of conducting a moment analysis comprises: 

obtaining the list of edge points; 

calculating, using the list of edge points, the second order moments for the object; 

characterizing an ellipse from the second order moments including the center of the ellipse (x, y), the lengths
of each axis (a and b) and the rotation angle ($\theta$); and

from the ellipse, characterizing a bounding box about the object, wherein the bounding box for the rectangular
object is a rectangle centered at (x, y) with sides of length 2a and 2b, rotated by an angle $\theta$.

16. The method of claim 15, wherein the length of the sides of the bounding box are decreased so as to crop pixels
along at least a portion of the edge of the object.

17. An image processing apparatus, including a programmable computer (20) capable of receiving a digitized input
image, the computer including a frame buffer memory (52) for storing the input image and program memory (52)
for the storage of code suitable for causing the computer to execute image processing operations including:

identifying a plurality of objects within the digitized input image;
modeling a shape representing boundaries of the object; and
creating a description to characterize the object.


18. The apparatus of claim 17, wherein the description comprises a subset of characteristics selected from the set
consisting of:

a position of the object with respect to the input image;
a shape of the object; and
a rotation angle of the object with respect to the input image.

19. The apparatus of claim 17 or 18, wherein the operation of identifying at least one object comprises the steps of:

identifying the background region surrounding the at least two objects;
smoothing noisy edges within the image using a morphological filtering process; and
locating a contiguous foreground region.

20. The apparatus of claim 19, wherein the step of identifying the background region comprises:

creating a binary mask, each location in the mask representing a pixel of the input image,
determining a background color,
using an adaptive seed fill process, setting each binary location in the mask equal to a first state if the color
is substantially equal to the background color, otherwise setting it to a second binary state,
adjusting the background color, if necessary, and continuing the process until at least the edges of all objects
have been identified; and wherein the step of identifying at least two objects further comprises
determining the boundary of each object by tracing an edge, and
eliminating foreground regions within the input image that are not associated with imagery.

21. Reproduction apparatus including image processing apparatus according to any of claims 17 to 20.

FIG. 3 (flowchart): LOCATE OBJECTS; MODEL OBJECT SHAPES; CREATE STRUCTURED IMAGE DESCRIPTION.

EP 0 974 931 A1 




FIG. 4 (flowcharts): IDENTIFY BACKGROUND IMAGE; CHARACTER BACKGROUND REGION; IDENTIFY BACKGROUND REGION (reference
numerals 104 and 106 legible); GET SEED; ALLOCATE BINARY FRAME BUFFER; FILL FRAME BUFFER USING METRIC; MODIFY
ADAPTIVE AVERAGE (reference numerals 108, 110, 112 and 114 legible).

FIG. 5 (flowchart): GET EDGE TRACE LIST (202); CALCULATE SLOPE AT EACH POINT (204); PLACE EDGE POINTS INTO SLOPE
CATEGORIES (206), with a tolerance input; SORT SLOPE CATEGORIES (208); SELECT CATEGORIES (210); CHARACTERIZE SHAPE
(212), leading to step 280; LEAST SQUARES FIT (280); CONVERT PARAMETERS INTO SHAPE DESCRIPTION.

FIG. 6 (flowchart): GET EDGE TRACE LIST (202); CALCULATE OBJECT CENTER OF MASS (252); LOCATE EDGE POINT CLOSEST TO
CENTER OF MASS (254); DETERMINE ROTATION ANGLE (256); DETERMINE L a (258); DETERMINE [label not legible] (260);
LOCATE EDGE POINT CLOSEST TO CENTER OF MASS (262); LOCATE EDGE POINT CLOSEST TO CENTER OF MASS (264).

FIG. 7 (flowchart): LOCATE OBJECTS (100); BOUNDARY LIST (290); SECOND ORDER MOMENTS (292); BEST SHAPE FIT (294).

# structured image ver1.0

sid : smp = {
  aspect_ratio = 1.0 ;
  representation = {
    format = ipd ;
    data = [
      merge = {
        xy = 0.000000 0.132977 ;
        path = {
          object = {    # first detected object
            size = 0.500000 0.735247 ;
            sid : image1 = {
              aspect_ratio = 1.470494 ;
              representation = {
                format = raster ;
                data = "smp.int" ;    # scanned image
                attribute = {
                  selection = $sel1 ;    # object rectangle
                  derotate = 14.445258 ;    # derotation angle
                } ;
              } ;
            } ;
          } ;
        } ;
      } ;
      merge = {
        xy = 0.500000 0.180773 ;
        path = {
          object = {    # second detected object
            size = 0.500000 0.638454 ;
            sid : image0 = {
              aspect_ratio = 1.276908 ;
              representation = {
                format = raster ;
                data = "smp.int" ;    # scanned image
                attribute = {
                  selection = $sel0 ;    # object rectangle
                  derotate = 12.238364 ;    # derotation angle
                } ;
              } ;
            } ;
          } ;
        } ;
      } ;
    ] ;
  } ;
} ;

selection : sel1 = Selection: In[...]
  rect : Include
  [...] 0.858490 0.958967 [...]

[...]    # rectangle of second object

FIG. 9

FIG. 10 (screenshot of user interface 400, titled "Smart Scan"): scan mode selections (Normal Scan, Gang & Edit, Gang
Scan, Gang Extract) in region 410, with "Gang & Edit" selected (412); layout, "Save To" directory and file, and options
such as Auto Image Enhancement in region 420; composed page in region 430 with title 432 ("Alaska '92"), subtitle 434
("Family Vacation") and control points 436 about an image object; text options (font, style, size, color) in region 440;
editing buttons including 452 and 456 in region 450; "Save Edited Image" button 460; segmentation parameter "Color
Tolerance" and a progress indicator.

European Patent Office

EUROPEAN SEARCH REPORT

Application Number: EP 98 30 5917

DOCUMENTS CONSIDERED TO BE RELEVANT

Citation of document with indication, where appropriate, of relevant passages | Relevant to claim | Classification of the application (Int.Cl.6)

EP 0 883 287 A (PRIMAX ELECTRONICS LTD), 9 December 1998; * column 1, line 39 - line 48 *; * column 3, line 1 - line 33 *; * column 4, line 2 - column 5, line 24 * | 1,2 | G06T5/00
EP 0 505 077 A (HUGHES AIRCRAFT CO), 23 September 1992; * column 2, line 51 - column 3, line 18 *; * column 6, line 8 - line 54 *; * page 2 * | 1-21 |
EP 0 506 327 A (TEXAS INSTRUMENTS INC), 30 September 1992; * abstract *; * column 2, line 22 - line 41 *; * column 5, line 46 - line 51 * | 1-21 |

Technical fields searched (Int.Cl.6): G06T

The present search report has been drawn up for all claims.

THE HAGUE, 26 February 1999, Gonzalez Ordonez, O

CATEGORY OF CITED DOCUMENTS

X : particularly relevant if taken alone
Y : particularly relevant if combined with another document of the same category
A : technological background
O : non-written disclosure
P : intermediate document
T : theory or principle underlying the invention
E : earlier patent document, but published on, or after the filing date
D : document cited in the application
L : document cited for other reasons
& : member of the same patent family, corresponding document

ANNEX TO THE EUROPEAN SEARCH REPORT
ON EUROPEAN PATENT APPLICATION NO. EP 98 30 5917

This annex lists the patent family members relating to the patent documents cited in the above-mentioned European search report.
The members are as contained in the European Patent Office EDP file on 26-02-1999.
The European Patent Office is in no way liable for these particulars which are merely given for the purpose of information.

Patent document cited in search report | Publication date | Patent family member(s) | Publication date

EP 0883287 A | 09-12-1998 | NONE |
EP 0505077 A | 23-09-1992 | US 5265173 A | 23-11-1993
             |            | CA 2061313 A | 21-09-1992
             |            | JP 5101183 A | 23-04-1993
EP 0506327 A | 30-09-1992 | US 5210799 A | 11-05-1993
             |            | DE 69226551 D | 17-09-1998
             |            | DE 69226551 T | 24-12-1998
             |            | JP 6124344 A | 06-05-1994
             |            | US 5566246 A | 15-10-1996

For more details about this annex: see Official Journal of the European Patent Office, No. 12/82